Two-Level Scheme for Identification of the Relaxation Time Spectrum Using Stress Relaxation Test Data with the Optimal Choice of the Time-Scale Factor

The viscoelastic relaxation spectrum is vital for constitutive models and for insight into the mechanical properties of materials, since, from the relaxation spectrum, other material functions used to describe rheological properties can be uniquely determined. The spectrum is not directly accessible via measurement and must be recovered from relaxation stress or oscillatory shear data. This paper deals with the problem of the recovery of the relaxation time spectrum of linear viscoelastic material from discrete-time noise-corrupted measurements of a relaxation modulus obtained in the stress relaxation test. A two-level identification scheme is proposed. In the lower level, the regularized least-square identification combined with generalized cross-validation is used to find the optimal model with an arbitrary time-scale factor. Next, in the upper level, the optimal time-scale factor is determined to provide the best fit of the relaxation modulus to experiment data. The relaxation time spectrum is approximated by a finite series of power–exponential basis functions. The related model of the relaxation modulus is proved to be given by compact analytical formulas as the products of power of time and the modified Bessel functions of the second kind. The proposed approach merges the technique of an expansion of a function into a series of independent basis functions with the least-squares regularized identification and the optimal choice of the time-scale factor. Optimality conditions, approximation error, convergence, noise robustness and model smoothness are studied analytically. Applicability ranges are numerically examined. These studies have proved that using a developed model and algorithm, it is possible to determine the relaxation spectrum model for a wide class of viscoelastic materials. The model is smoothed and noise robust; small model errors are obtained for the optimal time-scale factors. The complete scheme of the hierarchical computations is outlined, which can be easily implemented in available computing environments.


Introduction
Numerous mathematical rheological models are used to describe the mechanical properties of viscoelastic materials [1][2][3]. The Maxwell and Kelvin-Voight models are, probably, the best-known rheological models. However, deeper insight into the complex behavior of viscoelastic materials is provided by the relaxation spectrum [2,4,5]. The relaxation spectrum is vital for constitutive models and for insight into the properties of a material, since, from the relaxation spectrum, other material functions used to describe rheological properties can be uniquely determined [4,6,7]. It is commonly used to describe, analyze, compare, and improve the mechanical properties of polymers [4,8,9], concrete [10], asphalt [11], rubber [12], wood [13], glass [14], dough [15], polymeric textile materials [16], geophysics [17] and biological materials [18]. The spectrum is not directly measurable; therefore, it must be recovered from oscillatory shear or stress relaxation data [2,5]. A number of different methods and algorithms have been proposed during the last five decades for the recovery frequencies was considered. The complementary problem of determining the spectrum of relaxation times based on the stress relaxation data is both analytically and numerically more difficult than the determination of the spectrum of relaxation frequencies, because the relaxation time occurs in the denominator of the exponential function in the spectrum definitional formula. This very task is the subject of this article. Previous studies [18,39,40] have suggested that by selecting an appropriate time-scale factor, a better fit of the model to the measurement data can be obtained.
The objective of the present paper was to develop a model and an identification algorithm for the determination of the continuous relaxation time spectrum based on discrete-time measurements of the relaxation modulus, which, taking into account the ill-posedness of the original problem of the spectrum recovery and the idea of the optimal selection of the time-scale factor, will provide: (a) good approximation of the relaxation spectrum and modulus also due to the best choice of the time-scale factor; (b) smoothness of the spectrum fluctuations, even for noise-corrupted measurements; (c) noise robustness; (d) applicability to a wide range of viscoelastic materials; (e) ease of the implementation of the model and identification algorithm in available computing packages. The idea of the optimal choice of time-scale factor is used here for the first time in the context of relaxation spectrum identification. The goal of this work was the synthesis of the respective model and identification scheme and the analysis of their properties. A further purpose was the numerical verification of the model and algorithm for double-mode Gauss-like distribution used to describe the viscoelastic properties of various materials, in particular different polymers, wood and glass.
A new hierarchical algorithm of noise robust approximation of the continuous spectrum of relaxation times by finite series of power-exponential basis functions is proposed. The components of the relaxation modulus model are given by compact analytical formulas described by the product of power of time and the modified Bessel function of the second kind. The main properties of the basis functions of relaxation spectrum and modulus models have been studied; positive definiteness, upper bounds, monotonicity and asymptotic properties have been examined. Ranges of applicability for different time-scale factors are determined. Since the problem of relaxation spectrum identification is an ill-posed inverse problem, the regularized least-squares identification technique is applied combined with generalized cross-validation to guarantee the stability of the scheme. The quadratic identification index refers to the measured relaxation modulus. The task of determining the best "regularized" model is solved at the lower level of the identification scheme. The appropriate choice of the time-scale factor on the upper level of the scheme ensures the best fit of the relaxation modulus to experimental data.
The optimality conditions are derived both for the identification problem solved at the lower level and the task of the optimal choice of the time-scale factor at the upper level of the scheme. It is proved that the smoothness of the vector of the optimal model parameters implies smoothness of the fluctuations of the relaxation spectrum model. A direct formula and upper and lower bounds for the square integral norm of the smoothed spectrum model are derived. The accuracy of the spectrum model for noisy measurements of the relaxation modulus is studied, and the linear convergence to the model that we would obtain for the noise-free measurements is proved.
To design a numerical algorithm based on the scheme, the computations should be arranged hierarchically in a two-level scheme, i.e., for each iteration of the minimization procedure at the upper level, the whole numerical procedure must be realized for the lower-level task. The complete computational procedure for determining the best model is described. The singular value decomposition technique is applied to simplify the algebraic computations. The identification scheme can be easily implemented in available computing environments. The numerical studies, especially the strong smoothing of the relaxation spectrum models in the example of a double-mode Gauss-like spectrum, with a simultaneous very good fit to the experimental data of the relaxation modulus models, suggests the need to explore the applicability of the regularized weighted least-squares [41] in the lower-level identification task. This will be the subject of further work.
In Appendix A, the proofs and derivations of some mathematical formulas are given. Some tables have been moved to Appendix B to increase the clarity of the article.

Relaxation Time Spectrum
The uniaxial, non-aging and isothermal stress-strain equation for a linear viscoelastic material is represented by a Boltzmann superposition integral [2]: where u is the past time variable in the range −∞ to the present time t. In Equation (1), variables σ(t) and ε(t) denote, respectively, the stress and strain at time t, and G(t) is the linear relaxation modulus. Modulus G(t) is given by [2,5]: where H(τ) characterizes the distributions of relaxation times τ. The continuous relaxation spectrum H(τ) is a generalization of the discrete Maxwell spectrum [2,5] to a continuous function of the relaxation times τ. The problem of relaxation spectrum identification is the problem of solving a system of Fredholm integral equations of the first kind (2) obtained for discrete-time measurement data. In this paper, time measurements of the relaxation modulus are considered. This problem is ill-posed in the Hadamard sense [42], i.e., small changes in measured relaxation modulus can lead to arbitrarily large changes in the determined relaxation spectrum. As a remedy, some reductions in the set of admissible solutions or the appropriate regularization of the original problem are used. Both the techniques are applied here.

Models
Assume that H(τ) ∈ L 2 (0, ∞), where L 2 (0, ∞) is the space of real-valued squareintegrable functions on the interval (0, ∞). It is known that the set of the linearly independent functions e −ατ , τe −ατ , τ 2 e −ατ , . . . form a basis of the space L 2 (0, ∞) [43] (p. 125); here, α is a positive time-scaling factor. Since the maximum of the function h k (τ, α) = τ k e −ατ grows rapidly with k, it is convenient to expand the relaxation spectrum into a series of scaled basis functions with the first function as follows: where g k are real model coefficients.
It is practical to replace the infinite summation in Equation (5) with a finite one of K first terms, i.e., to approximate the spectrum H(τ) by a model of the form where the lower index is the number of model summands. Then, according to (2), the respective model of the relaxation modulus is described by: where the functions For computational purposes, function h 0 (τ, α) is defined by (4). However, since most mathematicians agreed that 0 0 = 1 [44], the general Formula (3) can also be applied in further analysis for k = 0.
The basis functions φ k (t, α) (8) of the modulus model (7) are given by a compact analytical formula specified by the following theorem proved in Appendix A.1.
Theorem 1. Let α > 0, k ≥ 0 and t > 0. Then, the basis functions φ k (t, α) (8) are given by: where K k (x) is the modified Bessel function of the second kind [45,46] of integer order k.
The first Function (9) is as follows The modified Bessel functions of the second kind, and especially K 0 (x) [47], have many applications in science and engineering, for example, in physics to describe the flow of magneto-hydrodynamic (MHD) viscous fluid in a Darcy-type porous medium [48], in engineering to derive a closed analytical form of the model of a axial-flux permanent magnet machine with segmented multipole-Halbach PM array [49] and to describe the perunit-length internal impedance of two-layer cylindrical conductors [50]. The applications of the Bessel functions in the description of the dynamic response of a mono-pile foundation in homogeneous soil and varied layered soil-rock conditions under horizontal dynamic loads [51], to obtain a fully coupled poroelastic solution for spherical indentation into a half space with an impermeable surface when the indenter is subjected to step displacement loading [52] and to express a distribution of the traveling distance in heterogeneous populations [53] come from material science and ecology.
Five first basis functions h k (τ, α) (3) are shown in Figure 1 for two different values of the time-scaling factor α. Figure 2 shows the related functions φ k (t, α) (9). The basis functions h k (τ, α) and φ k (t, α) are dimensionless. It is seen from Figure 1 that the maximum of each scaled basis function h k (τ, α) (3) and (4) is equal one; however, the relaxation time t max corresponding to the maximum, for a given parameter α, depends on the index k according to the formula t max = k/α; i.e., grows with k. This means that increasing the number of model components K will allow for good modeling of multimodal spectra, which is confirmed by the example presented in the final part of the paper. Reducing the time-scale factor α shifts the spectrum maxima towards larger relaxation times. From Figure 2, it is seen that the Debye decay monotonicity of basis functions for the relaxation modulus model is in good agreement with the courses of the relaxation modulus obtained in an experiment for real materials; for example: concrete [10] (Figure 13), rubber [12] ( Figure 2), elastic polyacrylamide hydrogels [28] (Figures 2a,b, 4a, A5, A7 and A8a), sugar beet [18] (Figure 1) and several foods [3] (Figures 3-39).

Positive Definiteness of the Basis Functions
The basis functions of the relaxation spectrum and modulus models are positive definite. For the functions ℎ ( , ) (3) and (4), these properties are obvious. Since, according to the property A.1 in [54], the Bessel functions of the second kind ( ) are positive for > 0 and real , the positive definiteness of the functions ( , ) (9) and ( , ) (10) directly result.

Asymptotic Properties of the Basis Functions
For → 0, the following asymptotic formula is found in the literature [45]: for modified Bessel functions of the order > 0. Thus, for → 0, Formulas (11) and (9) imply which means that for near zero, the values of basis functions ( , ) decrease with increasing index ; see Figure 2.

Positive Definiteness of the Basis Functions
The basis functions of the relaxation spectrum and modulus models are positive definite. For the functions ℎ ( , ) (3) and (4), these properties are obvious. Since, according to the property A.1 in [54], the Bessel functions of the second kind ( ) are positive for > 0 and real , the positive definiteness of the functions ( , ) (9) and ( , ) (10) directly result.

Asymptotic Properties of the Basis Functions
For → 0, the following asymptotic formula is found in the literature [45]: for modified Bessel functions of the order > 0. Thus, for → 0, Formulas (11) and (9) imply ( , )~( ) , which means that for near zero, the values of basis functions ( , ) decrease with increasing index ; see Figure 2.

Positive Definiteness of the Basis Functions
The basis functions of the relaxation spectrum and modulus models are positive definite. For the functions h k (τ, α) (3) and (4), these properties are obvious. Since, according to the property A.1 in [54], the Bessel functions of the second kind K v (x) are positive for x > 0 and real v, the positive definiteness of the functions φ k (t, α) (9) and φ 0 (t, α) (10) directly result.

Asymptotic Properties of the Basis Functions
For x → 0, the following asymptotic formula is found in the literature [45]: for modified Bessel functions of the order k > 0. Thus, for t → 0 , Formulas (11) and (9) imply which means that for t near zero, the values of basis functions φ k (t, α) decrease with increasing index k; see Figure 2. For argument x → ∞ , the asymptotic exponential formula holds [54] From (9) and (12), the next asymptotic formula follows for large t and k ≥ 1 while for t → ∞ and the first basis function, we have The multiplication by power function √ αt k− 1 4 in (13) causes that the greater k is, the slower the basis function φ k (t, α) decreases, which is seen in Figure 2. The first basis function φ 0 (t, α) decreases faster than the exponential function, which is also confirmed by the analysis of the course of the basis functions in Figure 2. By the asymptotic Formulas (13) and (14), the basis functions tend to 0 as t → ∞ ; i.e., for an arbitrary, α > 0 and k = 0, 1, 2, . . .

Upper Bounds for the Basis Functions
In [54], Corollary 3.4, the following inequality is proved: From Theorem 3.1 in [54], for k ≥ 1 we have This inequality applied into (9) gives the following upper bound Note that having in mind the positive definiteness of the basis functions φ k (t, α), the limit (15) for the first function φ k (t, α) yields from (16), while, for any k ≥ 1, can be derived from (17).

Monotonicity of the Basis Functions
As indicated above, the basis functions h k (τ, α) (3) for k ≥ 1 have a global maximum equal to 1 for the relaxation time τ = kα −1 , while the first function h 0 (τ, α) (4) is monotonically decreasing.
Since the basis function φ k (t, α) (9) is the product of a monotonically decreasing Bessel function K k (x) [54] of increasing argument 2 √ αt and a monotonically increasing power which holds for any real-valued order v, to (9), gives for integer k ≥ 1 whereas, having in mind the positive definiteness of the Bessel function K k−1 2 √ αt for all t > 0, we immediately conclude that the function φ k (t, α) is monotonically decreasing; see Figure 2. For the first basis function φ 0 (t, α) (10), the next differentiation formula is [54] (Equation (A.14)): which is satisfied for any real v, implies which shows that φ 0 (t, α) is also a monotonically decreasing function.

Ranges of Applicability
In the models, the parameter α > 0 is a time-scaling factor. The following rule holds: the lower the parameter α, the greater the relaxation times. The above is illustrated by Figures 1 and 2. Through the optimal choice of the scaling factor, the best fit of the model to the experimental data is achieved, which will be the subject of study in Section 3.2.
Following [40], on the basis of the relaxation modulus course, the range of applicability is specified as the time t, for which the first K basis functions φ k (t, α) no longer permanently exceed; i.e., for any θ > t, ε = 0.5% of its maximum value. Specifically, where Similarly, in [40], the range of applicability specified directly for the relaxation times τ was defined on the basis of the variability in the basis functions h k (τ); i.e., The times t app (α) (20) and τ app (α) (21) for different values of α are summarized in Table 1 for K = 5 and K = 12. The same data for K = 6, . . . , 11 are given in Table A1 in Appendix B.  (20) and τ app (α) (21) of the applicability intervals 0, t app (α) and 0, τ app (α) are given.
A review of the data from these tables shows that for any fixed α, both τ app (α) and t app (α) grow almost linearly with the number of model summands K.

Least-Squares Regularized Identification
Identification consists of the selection, within the chosen class of models given by (6) and (7), of such a model that ensures the best fit to the measurement results. Suppose a certain identification experiment (stress relaxation test [2,3]) resulted in a set of measurements of the relaxation modulus It is assumed that the number of measurements N ≥ K. As a measure of the model (7) accuracy, the quadratic index is taken where g K = g 0 · · · g K−1 T is an K-element vector of unknown coefficients of the models (6) and (7). The identification index (22) is rewritten in the compact form as where and · 2 denotes the square norm in the real Euclidean space R N . Thus, the optimal identification of the relaxation spectrum in the class of models defined by (6) and (7) consists of determining the model parameter g K through solving the following optimization task: The matrix Φ N,K (α) is usually ill-conditioned. In consequence, the problem (25) is ill posed in the sense of Hadamard [42]. The parameter g K minimizing identification index (23) is not unique, and even the normal (with the lowest Euclidean norm) solution of (25) is a non-continuous and unbounded function of the measurement vector G N . Therefore, when the data are noisy, even small changes in G N would lead to arbitrarily large artefacts in optimal model parameters g K . To deal with the ill-posedness, Tikhonov regularization [55], replacing the original ill-posed problem (25) by a nearby problem with a modified square functional of the form: where λ > 0 is a regularization parameter, can be used. The regularized task (26) is wellposed; that is, the solution always exists, is unique and continuously depends on both the matrix Φ N,K and on the measurement data G N . The parameter vector solving (26) is given by: where I K,K is the K-dimensional identity matrix. Following [40], the generalized cross-validation GCV [42,56] is applied, according to which, the best regularization parameter is [42,56]: where the GCV functional is defined by [56] with the matrix Here, Ξ(λ, α)G N is the residual vector for the regularized solution (27); tr[Ξ(λ, α)] denotes the trace of the symmetric matrix Ξ(λ, α). The optimization problem in (28) has a unique solution, and the resulting parameter g λ GCV (α) K (α) differs the least from the normal solution of the problem (26) that we would obtain for the ideal (not noise-corrupted) measurements of the relaxation modulus [56].

Results and Discussion
In this section, the necessary optimality condition for the regularized identification task with GCV's choice of regularization parameter is given in the form of an algebraic nonlinear equation. Next, the problem of the choice of the optimal scale factor is stated, and the respective necessary optimality condition is derived. A hierarchical two-level identification scheme with the optimal choice of the scale factor is proposed. The numerical realization and the application of a singular value decomposition technique are discussed. A complete computational procedure is outlined. The analysis of the smoothing of the model and model accuracy for noisy measurements of the relaxation modulus is presented.

Necessary Optimality Condition
The GCV functional (29) is a differentiable function of the regularization parameter λ. The necessary optimality condition for the minimization task solved in (28) is derived in Appendix A.3.

Theorem 2.
The optimal regularization parameter λ GCV (α) solves the following equation: where In Appendix A.4, the following formula: is developed, describing the derivative of the GCV functional (29) minimized in (28) directly as a function of matrices G N and Ψ(α).

Choice of the Time-Scale Factor
The optimal regularization parameter λ GCV (α) and matrix Φ N,K (α) depend on the time-scale factor. Therefore, the optimal model parameter and the optimal identification index also depend on α. By the optimal choice of the scaling factor, the best fit of the model to the experimental data can be achieved. By (27), (30) and (34), and having (A9) in mind, the index Q Nopt (α) (35) is expressed by the following formula: The smaller the index Q Nopt (α), the smaller the model error results. Thus, the problem of the choice of the best time-scale factor α opt takes the form Before we state the necessary optimality condition of the task (37), we prove the following result concerning the basis functions φ k (t, α) as the functions of the time-scale factor. The proof is given in Appendix A.5. Theorem 3. Let α > 0 and t > 0. Then, for k ≥ 1, derivatives of the basis functions φ k (t, α) (9) with respect to the parameter α are given by: For k = 0, the following formula holds: Thus, positive definite functions φ k (t, α) are monotonically decreasing functions of α for any fixed t > 0.
The following necessary optimality condition is derived in Appendix A.6: The optimal time-scale factor α opt satisfies the following equation: where derivative dλ GCV (α) dα is given by the equation with symmetric matrices where Ψ(α) is defined by (32) and

Two-Level Identification Scheme
To find the optimal time-scale factor α opt and the optimal model of the relaxation time spectrum, the following two-level scheme is applied:

Upper Level
Find the time-scale factor α opt > 0 minimizing identification index Q Nopt (α) (36), i.e., solving optimization task (37). Take where g λ GCV K α opt is defined by (34) as a parameter of the best model of the relaxation time spectrum.
Having the optimal model parameter g K , the optimal model of the relaxation spectrum is determined according to the formula resulting directly from (6): where g k are elements of the vector g K .
To design numerical realization of the scheme, we need:
An iterative scheme for solving the upper-level problem (37) of choosing the best time-scale factor.
For any given parameter α, the GCV function V GCV (λ, α) (29) is differentiable with respect to λ. Partial derivative ∂V GCV (λ,α) ∂λ is given by (33) as a function of the experiment data G N and the matrix Ψ(α) (32), which depends on the time instants t i used in the relaxation experiment and on the time-scaling factor. An arbitrary gradient optimization method can be implemented to solve the GCV minimization task (28). Additionally, an arbitrary gradient method can be used to solve optimization problem (37); the derivative of the index Q Nopt (α) (36) is described by Formula (A24) derived in Appendix A.6. However, the optimal parameter α opt can be also found by solving the necessary optimality condition from Theorem 4, i.e., the two scalar algebraic Equations (40) and (41), in fact.

Algebraic Background of the Identification Scheme
Formulas (27)-(30), fundamental for lower-level optimization task (28), are elegant but generally unsuitable for computational purposes. Following [40], the singular value decomposition (SVD) technique [57] will be applied. Let SVD of the N × K dimensional matrix Φ N,K (α) take the form [57]: where Taking advantage of the diagonal structure of Σ(α) and orthogonality of the matrices V(α) and U(α), it may be simply proved that the parameter g λ K (α) (27) is given by where K × N diagonal matrix Λ λ (α) is as follows: Using SVD (47) and introducing N dimensional vector Y(α) = U(α) T G N , the GCV function (29) is also expressed by a convenient analytical formula: as a function of the singular values σ i (α) and elements y i (α) of the vector Y(α). Similarly, the identification index Q Nopt (α) (36) minimized in the upper level of the scheme is expressed using the SVD (47). Since, by virtue of (47) and (32) we have

Computational Algorithm for Model Identification
To design a numerical algorithm of the scheme, the communication between the levels should also be resolved. The computations must be arranged hierarchically in a two-level structure, i.e., for each iteration of the minimization procedure at the upper level, the whole numerical procedure must be realized for the lower-level GCV task (28). The complete computational procedure for determining the optimal model is given below.
Step 1: Determine the optimal regularization parameter λ GCV α opt in the following twolevel computations.
Step 1.0: Choose the initial point α 0 for the numerical procedure applied to solve the upper-level task (37).
Step 1.1: Let α m be the m-th iterate in the numerical procedure chosen to solve the upper-level task (37). For α = α m , solve the lower-level minimization task (28) according to the chosen numerical optimization procedure and determine the regularization parameter λ GCV (α m ). The algebraic formula V GCV (λ, α) (50) is applied.
Step 1.2: Using λ GCV (α m ), compute, according to the numerical procedure selected to solve the upper-level task (37), with the index Q Nopt (α) described by (51), the new parameter α m+1 , which is the next approximation of α opt . If for α m+1 the stopping rule of the chosen numerical procedure is satisfied, i.e., where ε 1 and ε 2 are preselected small positives, put α opt = α m as the optimal time-scale factor, λ GCV (α m ) as λ GCV α opt , and go to Step 2. Otherwise, return to Step 1.1 and continue the computations for α = α m+1 .
Step 2: Compute the vector of the optimal model parameters g K according to (45) and the best model of the relaxation spectrum H Kopt (τ) given by (46).

Remark 1.
The appealing feature of the scheme is that only the values of λ GCV (α m ), not the related parameters vector g λ GCV K (α m ) (34), are used for α m in successive iterations of the numerical procedure solving the upper-level task (37).

Remark 2.
The regularization parameter λ GCV (α m ) resulting from the lower-level minimization task (28) in each iteration of the upper level is the solution of the GCV problem (28) for current α m . Thus, the respective vector g λ GCV K (α m ) (34) can be treated as an approximate solution of the overall identification problem.

Analysis
Recently, a class of robust algorithms of the approximation of the continuous s trum of relaxation frequencies by finite series of orthonormal functions has been d oped and analyzed in detail in [40]. However, these results were derived using th thogonality of the basis functions, so they are not applicable here. Therefore, both analysis of the smoothing of the model and model accuracy for noisy measuremen the relaxation modulus must be carried out anew here.

Analysis
Recently, a class of robust algorithms of the approximation of the continuous spectrum of relaxation frequencies by finite series of orthonormal functions has been developed and analyzed in detail in [40]. However, these results were derived using the orthogonality of the basis functions, so they are not applicable here. Therefore, both the analysis of the smoothing of the model and model accuracy for noisy measurements of the relaxation modulus must be carried out anew here.

Smoothness
The purpose of regularization relies on the stabilization of the resulting vector g λ K (α) (27). Since the basis functions h k (τ, α) (3) and (4) are such that h k (τ, α) ≤ 1 for any arguments, the following inequality holds for an arbitrary time-scale factor, which means that the smoothing of the vector of model parameters results in the limitation of the respective relaxation spectrum. The mechanism of the vector g λ K (α) (27) stabilization via Tikhonov regularization is explained in many papers, for example, [20,40,55]. The following rule holds: the greater the regularization parameter λ is, the more highly bounded the fluctuations of the vector g λ K (α) are; see [40].
The norm H K (τ, α) 2 is a measure of smoothing of the relaxation spectrum model, where · 2 also means the square norm in L 2 (0, ∞). In Appendix A.7, the following proposition is proved: Proposition 1. For an arbitrary time-scale factor α and arbitrary vector of model parameters g K , we have where K × K symmetric real matrices, Γ(α) described by (A39) and Γ 1 given by are a positive definite; matrix Γ(α) for an arbitrary time-scale factor α > 0.
The matrix Γ 1 is independent of the time-scale factor; only the multiplier 1 2α in the last expression of (52) depends on α. Since for any symmetric non-negative definite matrix, the eigenvalues and singular values are identical, by virtue of the Rayleigh-Ritz inequalities [58] (Lemma I): which hold for any x R m and any symmetric matrix X = X T R m,m , where λ min (X) and λ max (X) are minimal and maximal eigenvalues of the matrix X, Equation (52) implies the following estimations: where σ 1 (Γ 1 ) and σ min (Γ 1 ) denote the largest and the minimal singular values of matrix Γ 1 (53). Thus, the next result is derived.

Proposition 2.
For an arbitrary time-scale factor α and arbitrary vector of model parameters g K , the following inequalities hold: The square roots of the singular values σ 1 (Γ 1 ) and σ min (Γ 1 ) for K = 5, 6, . . . 12 are summarized in Table 2. However, the lower bound of this norm is useful only for small K and small time-scale factors. Since σ 1 (Γ 1 ) grows with K, the greater the number of model summands there are, the greater the time-scaling factor should be to achieve pre-assumed multiplier 1 √ 2α σ 1 (Γ 1 ) in the estimation (55). In [59], a decreasing sequence of upper bounds for the largest singular value of a non-negative definite square matrix is constructed, given by Equation (19) in [59]. This result applied to the K × K matrix Γ 1 means that is a decreasing sequence of upper bounds for σ 1 (Γ 1 ). The right inequality in (55) can be weakened to the following:
To summarize, the smoothness of the optimal solution g λ K (α) (27) of discrete regularized problem (26) guarantees that the fluctuations in the respective spectrum of relaxation, in particular the resulting spectrum of relaxation H K (τ) (46), are also bounded. The timescale factor α also affects the smoothness of the spectrum model.

Convergence and Noise Robustness
The relaxation spectrum recovery from experimental data is an inverse ill-posed problem, in which the identification index refers to the measured relaxation modulus, but not directly to the unknown relaxation spectrum H(τ). Therefore, we cannot estimate the model error H(τ) − H K (τ, α) 2 directly. As a reference point for the model H K (τ, α) (6) and (27), we will consider the model of the spectrum that we would obtain for the same time-scale factor α and regularization parameter λ on the basis of ideal (undisturbed) measurements of the relaxation modulus: where g λ K (α) is the vector of the regularized solution of (26) given by (compare (27)) for noise-free measurements of relaxation modulus G N = G(t 1 ) · · · G(t N ) T ; c.f., Equation (24). In Appendix A.8, the following estimations are derived:

Proposition 4.
For an arbitrary time-scale factor α and arbitrary regularization parameter λ, the error between the relaxation time spectrum models H K (τ, α) (6) and (27) and H K (τ, α) (58) and (59) is estimated using the following inequality: where z N = z(t 1 ) · · · z(t N ) T is the vector of measurement noises.
According to inequality (60), the accuracy of the spectrum approximation depends both on the measurement noises and regularization parameter and on the singular values σ 1 (α), . . . , σ r(α) (α) of the matrix Φ N,K (α) (47) depending on the time-scale factor. By (60), the spectrum H K (τ, α) tends to the noise-free spectrum H K (τ, α) in each relaxation time τ, at which they are both continuous, linearly with respect to the norm z N 2 , as z N 2 → 0 .

Example
Consider a viscoelastic material of the relaxation spectrum described by the doublemode Gauss-like distribution:  (Figures 4b and 8b), polyacrylamide gels [28] ( Figure A4), glass [61] (Figure 2), wood [13] and sugar beet [18] (Figure 2). It is shown in Appendix A.9 that the corresponding 'real' relaxation modulus is where the complementary error function er f c(x) is defined by [62]: In the experiment, N = 5000 sampling instants t i were generated with the constant period in the time interval T = [0, 1550] seconds selected in view of the course of the modulus G(t) (62). Additive measurement noises z(t i ) were selected independently by random choice with uniform distribution on the interval [−0.005, 0.005] Pa. The 'real' spectrum (61), modulus (62) and the basis functions h k (τ, α), φ k (t, α) were simulated in Matlab R2022a using the special functions besselk and erfc. For the singular value decomposition procedure, svd was applied. New calculation algorithms of the modified Bessel function of the second kind are constantly being developed; recently, an algorithm for parallel calculation was proposed [63].

Optimal Models
For K = 3, 4, . . . , 12, the optimal time-scaling factors α opt were determined via the proposed two-level identification scheme and are given in Table 3 together with the related regularization parameters λ GCV α opt . Next, the vectors of optimal model parameters g K = g λ GCV K α opt (45) were computed and are given in Table A3 in Appendix B; the elements of these vectors are both negative and positive. The square norms g K 2 and H Kopt (τ) 2 are also enclosed in Table 3 as the measures of the solution smoothness. The norm H Kopt (τ) 2 of the optimal model H Kopt (τ) (46) was determined through (52) based on g K . For the 'real' spectrum H(τ) (61), the norm H(τ) 2 = 19.2562 Pa. The approximation error between H(τ) (61) and their models H Kopt (τ) (46) was estimated via relative integral error ER1(K), defined by: which is expressed in a percentage in the penultimate column of Table 3. For the bimodal spectrum, the values of this error of the order of 33% are not surprising in the context of the ill-conditioned inverse problem. To compare the approximations H Kopt (τ) (46) obtained for successive integer K, the square index ER2(K) defined by the distance: is applied; see the last column of Table 3. The optimal indices Q Nopt α opt (37) are given in Table 3, too. The optimal models H Kopt (τ) (46) and the 'real' spectrum H(τ) (61) are plotted in Figure 4 for K = 3, 4, . . . 12. The analysis of the data from Table 3 and, in particular, the analysis of Figure 4b-d show that increasing the number of model components above K = 8 does not significantly affect the course of the model H Kopt (τ) and its accuracy. This is emphasized by the values of the ER1(K) (64) and ER2(K) (65) indices; in particular, the integral square error ER2(K) between successive H Kopt (τ) and H (K+1)opt (τ) spectrum approximations, which relates to the square norm H(τ) 2 2 for K ≥ 8 does not exceed 0.046%, and for K ≥ 10, it is of the order of 0.009%. Table 3. The parameters of the optimal models in the example: optimal time-scale factors α opt , upper α min and lower α max bounds of the interval [α min , α max ] of suboptimal time-scale factors defined by inequality (66) for ε α = 0.1, regularization parameters λ GCV α opt , optimal identification indices Q Nopt α opt defined in (37), square norms: g K 2 of the vector of optimal model parameter g K (45) and H Kopt (τ) 2 of the optimal relaxation spectrum model H Kopt (τ) (46), relaxation spectrum approximation error ER1(K) defined in (64)  In Figure 5, the optimal models of the relaxation modulus G K t, α opt computed for g K (45) according to (7) are plotted, where the measurements G(t i ) of the 'real' modulus G(t) (62) are also marked. The optimal models G K t, α opt have been well fitted to the experimental data, as indicated by the mean-square model errors Q Nopt α opt /N, which for 8 ≤ K ≤ 12 vary in the range from 4.889·10 −9 Pa 2 to 5.0482·10 −9 Pa 2 . Thus, models G K t, α opt for different K practically coincide with the measurement points and with each other; see Figure 5.

Optimal and Sub-Optimal Time-Scale Factors
The identification index Q Nopt (α) (35) minimized in the upper-level task (37) as a function of time-scale factor α is plotted in Figure 6a-d for K = 6, 8, 10, 12. The optimal parameters α opt are marked. The analysis of these plots suggests that there is such a neighborhood of α opt , namely a closed interval [α min , α max ], that α min ≤ α opt ≤ α max and for each α ∈ [α min , α max ], the identification index Q Nopt (α) differs from the minimal Q Nopt α opt not more than ε α ·Q Nopt α opt , i.e., where ε α is a small positive number. Inequality (66) means the deterioration of the model error is not greater than ε α percent of the optimal Q Nopt α opt . Any parameter α from the interval [α min , α max ] is a suboptimal time-scale factor. In Figure 6, ε α = 0.1, which means a 10% level of sub-optimality, is assumed, and the values of α min and α max are marked on the small subfigures. They are also given in Table 3. With the increase in the number of model components, the optimal parameter α opt increases and the range of sub-optimal time-scale factors shifts. For an exemplary number of model components, K = 9, the optimal model H Kopt (τ) (46) and models H K (τ, α) (6), optimal in the sense of lower-level task (28) for suboptimal parameters α min and α max , are plotted in Figure 7a; the 'real' spectrum H(τ) (61) is presented, too. The corresponding optimal regularization parameters are as follows: λ GCV (α min ) = 0.0167 and λ GCV (α max ) = 0.0094. For respective vectors of optimal model parameters, we have g λ GCV K (α min ) 2 = 2.5114 Pa and g λ GCV K (α max ) 2 = 2.5119 Pa. The norms of the relaxation spectra are H K (τ, α min ) 2 = 17.4472 Pa and H K (τ, α max ) 2 = 20.2069 Pa. In Figure 7b, the 'real' modulus G(t) (62) and the models: G K t, α opt (7) and (45) and G K (t, α min ), G K (t, α max ) computed for g λ GCV K (α min ), g λ GCV K (α max ) according to (7) are plotted. However, the relaxation moduli are almost identical (Figure 7b), the spectrum models differ (Figure 7a), which emphasizes the importance of the time-scale factor optimal selection. Too strong smoothing of the relaxation spectrum models, with a simultaneous very good fit to the experimental data of the relaxation modulus models, indicates the need to modify the quality index in the regularized task (26). For example, the square term g K 2 2 in the second component of the objective function can be replaced by the quadratic form g T K Wg K , and a positive definite weight matrix W or regularized weighted least-squares approach [41] can be applied. This will be the subject of further work.
which is expressed in a percentage in the penultimate column of Table 3. For the bimodal spectrum, the values of this error of the order of 33% are not surprising in the context of the ill-conditioned inverse problem. To compare the approximations ( ) (46) obtained for successive integer , the square index 2( ) defined by the distance: is applied; see the last column of Table 3. The optimal indices (37) are given in Table 3, too. The optimal models ( ) (46) and the 'real' spectrum ( ) (61) are plotted in Figure 4 for = 3,4, … 12. The analysis of the data from Table 3   In Figure 5, the optimal models of the relaxation modulus , computed for (45) according to (7) are plotted, where the measurements ̅ ( ) of the 'real' modulus ( ) (62) are also marked. The optimal models , have been well fitted to the experimental data, as indicated by the mean-square model errors ⁄ , which for 8 ≤ ≤ 12 vary in the range from 4.889 • 10 to 5.0482 • 10 . Thus, models , for different practically coincide with the measurement points and with each other; see Figure 5.

Optimal and Sub-Optimal Time-Scale Factors
The identification index ( ) (35) minimized in the upper-level task (37) as a function of time-scale factor is plotted in Figure 6a-d for = 6,8,10,12. The optimal parameters are marked. The analysis of these plots suggests that there is such a neighborhood of , namely a closed interval [ , ], that ≤ ≤ and for each ∈ [ , ], the identification index ( ) differs from the minimal not more than • , i.e., where is a small positive number. Inequality (66) means the deterioration of the model error is not greater than percent of the optimal . Any parameter from the interval [ , ] is a suboptimal time-scale factor. In Figure 6, = 0.1, which means a 10% level of sub-optimality, is assumed, and the values of and are marked on the small subfigures. They are also given in Table 3. With the increase in the number of model components, the optimal parameter increases and the range of sub-optimal time-scale factors shifts. For an exemplary number of model components, = 9, the optimal model ( ) (46) and models ( , ) (6), optimal in the sense of lower-level task (28) for suboptimal parameters and , are plotted in Figure 7a; the 'real' spectrum ( ) (61) is presented, too. The corresponding optimal regularization parameters are as follows:  Figure 7b, the 'real' modulus ( ) (62) and the models: , and (45) and ( , ) , ( , ) computed for ( ), ( ) according to (7) are plotted. However, the relaxation moduli are almost identical (Figure 7b), the spectrum models differ (Figure 7a), which emphasizes the importance of the time-scale factor optimal selection. Too strong smoothing of the relaxation spectrum models, with a simultaneous very good fit to the experimental data of the relaxation modulus models, indicates the need to modify the quality index in the regularized task (26). For example, the square term ‖ ‖ in the second component of the objective function can be replaced by the quadratic form , and a positive definite weight matrix or regularized weighted least-squares approach [41] can be applied. This will be the subject of further work.

Conclusions
In this paper, a new robust hierarchical algorithm for the identification of the relaxation time spectrum, which combines the technique of an expansion of a function into a series of basis functions with the least-squares regularized identification and the optimal choice of the time-scale factor, has been derived. The task of determining the best 'regularized' model was solved at the lower level, while the optimal time-scale factor was selected on the upper level of the identification scheme. The continuous spectrum of relaxation times was approximated by finite series of power-exponential basis functions, while the components of the relaxation modulus model were proven to be described by the product of power of time and the modified Bessel function of the second kind. In the present paper, the problem of the optimal choice of the time-scale factor to ensure the best fit of the model to experimental data has been formulated and solved for the first time in the context of the relaxation spectrum identification.
The necessary optimality conditions both for the optimal regularized least-squares identification task and the problem of the optimal selection of the time-scale factor were derived in the form of nonlinear algebraic equations. The main properties of the basis functions of the relaxation spectrum and modulus models, their positive definiteness, convenient upper bounds, monotonicity, asymptotic properties and wide range of applicability for different time-scale factors indicated the possibility of using the proposed model and identification algorithm to determine the spectrum of a wide class of viscoelastic materials.
The overly strong smoothing of the relaxation spectrum models in the example, with a very good fit to the experimental data of the relaxation modulus models, indicates the need to modify the quality index in the lower level identification task. An introduction of a weight matrix in the second component of the objective function or a direct application of regularized weighted least-squares should be investigated. Another solution is the selection of such a spectrum model which guarantees the assumed level of smoothing and optimal adjustment to the relaxation modulus measurement data. These approaches will be the subject of further work.
The presented scheme of the relaxation spectrum identification can be easily modified for retardation spectrum recovery from creep compliance measurements obtained in the standard creep test. Relaxation time spectra and moduli from the example and the corresponding models for K = 9 summands of the model: (a) 'real' spectrum H(τ) (61) (solid red line) and the models: optimal H Kopt (τ) (46) and suboptimal H K (τ, α min ) and H K (τ, α max ); (b) 'real' modulus G(t) (62) (red points), the optimal approximated model G K t, α opt (7) and (45), the suboptimal models G K (t, α min ) and G K (t, α max ) computed for g λ GCV K (α min ) and g λ GCV K (α max ) according to (7).

Conclusions
In this paper, a new robust hierarchical algorithm for the identification of the relaxation time spectrum, which combines the technique of an expansion of a function into a series of basis functions with the least-squares regularized identification and the optimal choice of the time-scale factor, has been derived. The task of determining the best 'regularized' model was solved at the lower level, while the optimal time-scale factor was selected on the upper level of the identification scheme. The continuous spectrum of relaxation times was approximated by finite series of power-exponential basis functions, while the components of the relaxation modulus model were proven to be described by the product of power of time and the modified Bessel function of the second kind. In the present paper, the problem of the optimal choice of the time-scale factor to ensure the best fit of the model to experimental data has been formulated and solved for the first time in the context of the relaxation spectrum identification.
The necessary optimality conditions both for the optimal regularized least-squares identification task and the problem of the optimal selection of the time-scale factor were derived in the form of nonlinear algebraic equations. The main properties of the basis functions of the relaxation spectrum and modulus models, their positive definiteness, convenient upper bounds, monotonicity, asymptotic properties and wide range of applicability for different time-scale factors indicated the possibility of using the proposed model and identification algorithm to determine the spectrum of a wide class of viscoelastic materials.
The overly strong smoothing of the relaxation spectrum models in the example, with a very good fit to the experimental data of the relaxation modulus models, indicates the need to modify the quality index in the lower level identification task. An introduction of a weight matrix in the second component of the objective function or a direct application of regularized weighted least-squares should be investigated. Another solution is the selection of such a spectrum model which guarantees the assumed level of smoothing and optimal adjustment to the relaxation modulus measurement data. These approaches will be the subject of further work.
The presented scheme of the relaxation spectrum identification can be easily modified for retardation spectrum recovery from creep compliance measurements obtained in the standard creep test. To prove the theorem, we first derive the following integral formula: where v ∈ R. By applying in the integral of the left-hand side of (A1) the substitution √ α τ = √ te y we have: √ αt e −y e y dy, which can be expressed as Since, bearing in mind that cosh(y) is an even function, the integral in (A2) is rewritten as αt cosh(y) cosh(vy)dy, whence, by the following integral representation formula for the Bessel functions [45,46] (Equation (5) on p. 181 in [46]): where x > 0 and v ∈ R, Formula (A1) directly follows. Now, we prove Formula (9). Since, by (3) and (8) the basis functions whence, Formula (9) follows.

Appendix A.2. Algebraic Matrix Properties
In this appendix, a new differential matrix property, used to obtain the optimality conditions and to simplify numerical computations, is derived. First, for convenience, some matrix identities and inequalities known in the literature are also provided.
provided that respective matrices are invertible.

Appendix A.3. Proof of Theorem 2
First, we express Ξ(λ, α) (30) in a more useful equivalent form. By the matrix identity (A5) for λ > 0, we have whereas, by (30) and bearing in mind the notation (32), we obtain Since, by (30), the GCV functional V GCV (λ, α) (29) is expressed as its derivative with respect to λ is given by Now, we determine the partial derivatives which appear in the nominator of the right-hand side of (A10). By Formula (A6), we have where the derivative of the inverse matrix can be found due to property (A7), which, when applied to (A12), results in whereas, after algebraic manipulations, we obtain Now, we can derive the necessary optimality condition. Since, in view of (A9), matrix Ξ(λ, α) is positive definite for any λ > 0, derivative ∂V GCV (λ,α) ∂λ given by the fraction from the right-hand side of (A10) is equal to zero, if and only if its nominator is equal zero to, i.e., By (A18), (A14), (A9), and (A17), we have which is equivalent to whence, by standard algebraic manipulations, Equation (31) follows directly. The theorem is proved.
Appendix A.4. Derivation of the Formula (33) To derive the Formula (33) describing the derivative directly as a function of the matrices Ψ(α) and G N , let us consider the nominator of the right-hand side of (A10), i.e., the function Through tedious algebraic manipulations, the above expression is rewritten as (A20) Now, (A10) combined with (A20) and (A15) yields Formula (33).
Since, for an arbitrary k ≥ 1, the basis function φ k (t, α) (9) is expressed as by Formula (18), we immediately have which is rewritten as whence, in view of (9), Formula (38) follows directly, which completes the proof.